ABSTRACT


Determining  the  extent  to  which  factors  obtained  from  two 
different  studies  are  similar  has  plagued  researchers  for  some  time. 
The  degree  of  factorial  similarity  was  usually  assessed  by  the  use  of 
various indices (Tucker, 1951; Kaiser, 1960; Wrigley and Neuhaus,
1955; and Harman, 1967) or by the magnitude of the angles through
which one factorial structure would have to be rotated to approximate
another structure. Regardless of the technique employed for assessment,
none provided an adequate solution to the problem of
factorial invariance.

Using  a  Monte  Carlo  approach,  the  present  study  attempted 
to  develop  an  empirical  sampling  distribution  from  one  type  of  factor 
match  procedure,  namely  the  Orthogonal  Procrustes. 

Twenty-two different component structure matrices (A)
ranging  in  order  from  five  variables  by  three  components  to  twenty 
variables  by  six  components  served  as  population  A  matrices.  One 
randomly  generated  component  score  matrix  (F)  of  order  one 
thousand  observations  by  twenty  variables  provided  the  sample 
source  in  the  study.  Employing  the  Orthogonal  Procrustes  factor 
match  procedure  and  a  constant  sample  size  of  fifty,  one  thousand 
matches  were  performed  for  each  A  matrix.  The  average  trace  of 
(E'E)  and  the  "largest  absolute  value"  of  the  error  matrix  were 
computed.  Following  the  calculation  of  the  frequency  distributions 
of  average  trace  of  (E'E)  where  E  is  defined  as  the  difference 
between  the  rotated  source  and  the  target  matrices  and  the  "largest 
absolute value" for the one thousand matches, the 25, 50, 75, 90, 95,
and  99  percentile  points  were  calculated. 

For  several  of  the  A  matrices,  the  procedure  was  repeated 
using  sample  sizes  of  one  hundred  and  one  hundred  and  fifty.  In  such 
cases,  only  five  hundred  matches  were  performed. 

The  values  for  the  various  percentiles  are  presented  and  can 
be  used  as  a  guideline  in  evaluating  the  significance  of  the  results 
obtained  from  matching  two  data  sets  when  the  Orthogonal  Procrustes 
factor  match  is  used. 

In addition, cumulative frequency polygons and frequency
polygons of the average trace of (E'E) values were plotted for various
matrices to give an indication of the shape of the
distribution.
CHAPTER I


INTRODUCTION  AND  STATEMENT  OF  PROBLEM 


Factor  analysis  is  a  method  for  extracting  information  from 
a  data  set.  One  begins  with  a  set  of  observations  on  a  number  of 
variables and then, by analyzing the intercorrelations of the variables,
tries  to  express  the  measures  in  terms  of  a  smaller  number  of 
reference  variables.  These  reference  variables  have  been  called 
factors  or  components. 

The  term  "factor  analysis"  is  often  used  ambiguously.  It  is 
used  generically  to  refer  to  the  representation  of  a  variable  in  terms 
of  several  underlying  variables,  and  it  is  used  specifically  to  refer  to 
an analytic procedure which best reproduces the observed correlations.
The latter use of the term is in contrast to "component
analysis" whose purpose is to extract maximum variance. The
present  study  is  concerned  specifically  with  component  analysis. 
Where  the  term  factor  is  used,  it  is  used  in  its  generic  sense. 

One  of  the  most  important  problems  that  has  yet  to  be 
solved  by  psychometricians  is  the  problem  of  determining  the  extent 
of  comparability  of  factors  obtained  from  two  different  sets  of  data. 
Based  upon  the  nature  of  the  data,  four  cases  may  be  distinguished 
depending  upon  whether  the  same  or  different  variables  and  whether 
the  same  or  different  individuals  have  been  observed  in  the  two 
data  sets. 

The  most  common  situation  and  the  only  one  of  interest  in 
the present investigation is the situation in which the same variables
and  different  individuals  form  the  basis  of  the  analysis.  Specifically 
the  problem  is  one  of  deciding  whether  a  component  found  in  one 
sample  can  be  identified  with  that  found  in  another  sample.  Methods 
of  solution  for  this  case  have  been  developed  by  Burt  (1948),  Tucker 
(1951), Ahmavaara (1954), Wrigley and Neuhaus (1955), Kaiser
(1960), and Schonemann (1966).

In  the  early  exploratory  studies,  generally  rough  methods 
of  inspection  and  personal  impressions  sufficed  as  the  basis  for 
identification of similar factors in separate studies. In 1945,
Zachert and Friedman (1953) administered the 23 variable Aircrew
Classification  Battery  to  8574  aviation  trainees.  Two  years  later, 
the  same  battery  was  administered  to  1511  pilot  trainees.  They  then 
set  out  to  compare  the  factorial  content  of  the  battery  on  war-time 
and  post-war  samples.  The  factor  analyses  of  the  two  batteries 
showed  that  the  factorial  contents  for  both  samples  were  similar. 
They  used  as  a  criterion  of  stability,  loadings  for  variables  that 
were at least .30 in both samples.

Burt  (1948)  in  his  study  of  emotional  traits  of  children 
proposed  the  "proportionality  criterion"  for  assessing  the  amount 
of  agreement  between  factors  derived  from  different  factorial 
analyses.  A  set  of  factor  coefficients  of  twelve  temperamental 
traits  for  a  general  emotionality  factor  was  compared  with  a 
teacher's  set  of  independent  gradings  for  general  emotionality.  A 
value  of  .931  was  obtained. 


Tucker  (1951)  analyzed  two  studies  -  one  involving  18 
variables for a sample of Naval Recruits and the other involving 44
variables  for  a  sample  of  Aircrew  and  Soldiers.  Ten  variables  were 
common  to  both  samples  and  six  factors  of  the  smaller  study  were 
matched  with  six  of  the  twelve  factors  of  the  larger.  Coefficients  of 
congruence  as  proposed  by  the  same  author  were  calculated  for  the 
six factors and those ranging from .999984 down to .939811 were
accepted as defining congruent or similar factors. A value of
.459717 was rejected as being too low to warrant congruence of
factors.

Wrigley  and  Neuhaus  (1955)  presented  the  same  formula  as 
Tucker  had  for  measuring  the  degree  of  factorial  similarity. 

More recently, Quereshi (1967) investigated the invariance
of  certain  ability  factors  with  respect  to  consistency  of  factors  for  a 
fixed  set  of  variables  and  different  samples.  The  data  represented 
the  performance  of  700  children,  comprising  seven  independent 
samples  of  100  subjects  each  on  the  Illinois  Test  of  Psycholinguistic 
Abilities and the Stanford-Binet. Coefficients of congruence as
defined  by  Burt  (1948),  Tucker  (1951),  and  Wrigley  and  Neuhaus 
(1955) were calculated across the seven samples on four factors.
The author concluded that since 78 of 84 coefficients were close to
unity,  it  would  be  justifiable  to  say  that  four  factors  possessed  a 
high  degree  of  stability  across  the  samples. 

Ahmavaara  (1957)  applied  his  method  of  transformation 
analysis  which  he  proposed  in  1954  to  a  comparison  of  factors  as 
revealed  by  factorial  analysis  of  ability.  The  degree  of  factorial 
similarity was attained by plotting the transformed matrix against
the target matrix and observing whether the points fall along a line
y = x. The author also suggested that the average sum of squares
for the difference between the transformed and target matrix may
be computed.

Hereford (1968) employed a semantic differential test to
determine  the  component  structure  of  the  responses  made  by  three 
categories  of  listeners  to  different  kinds  of  music.  She  then  employed 
Ahmavaara's  technique  to  assess  the  degree  of  similarity  between 
factors  for  the  different  groups.  Judgments  were  made  based  on  the 
size  of  the  values  in  the  transformation  matrix.  The  writer  stated 
that values of .9 or greater indicated a strong correspondence
between factors while values below .8 indicated a lower correspondence
between the factors.

Kaiser (1960) related Thurstone's four rotated primary
factors obtained from anthropometric measurements on 100 adult
North Ireland males with two orthogonal varimax factors based on
14 measurements on 50 London University male students. Six variables
were common to both samples. The "quality of fit" of the
factors between the two samples was ascertained by the mean cosine
index and a value of .85 was obtained. Kaiser concluded that the
value of .85 suggested that the pairing of the six variables was
appropriate  and  that  the  relationship  confirmed  the  hypothesis  of 
similar  factors  in  both  studies. 

Stewin  (1969)  employed  Kaiser's  method  in  matching  two 
samples  which  were  exposed  to  a  battery  purporting  to  measure 
conceptual  system  functioning.  Rather  than  computing  a  mean 
cosine index, Stewin based his conclusions upon the size of the cosine
values in the transformation matrix. Values of .800 or greater were
accepted as significant and indicated that factors were similar in both
samples.

In  a  study  intended  to  examine  the  relationship  between 
environmental  variables  and  different  mental  abilities,  Mosychuk 
(1969)  administered  the  Wechsler  Intelligence  Scale  for  Children 
(WISC)  to  100  ten  year  old  boys  and  interviewed  the  mothers  with  the 
Differential  Environmental  Process  Variables  (DEPVAR)  Interview 
Schedule.  The  sample  of  100  was  then  divided  into  two  groups  of  50 
and  analyses  were  conducted  on  each  group.  Responses  on  the  WISC 
and DEPVAR were factor analyzed. Group Two results were matched
with the target Group One using Kaiser's (1960) procedure. Values
in the transformation matrix of .866 and greater were accepted as
indicative  of  favorable  matching. 

The  Orthogonal  Procrustes  solution  is  another  method  which 
can  be  used  for  rotating  a  source  matrix  of  factor  loadings  to  a  target 
matrix.  Such  a  solution  was  derived  by  Schonemann  (1966). 

Two  samples  of  Illinois  high  school  teachers  were  employed 
to  rate  two  sets  of  fifteen  educational  objectives  on  thirty  scales  in  a 
study  conducted  by  Maguire  (1967).  Six  components  based  on  the 
factoring  of  the  intercorrelations  among  scales  were  obtained. 
Schonemann's Orthogonal Procrustes solution was used to rotate the
structure  of  sample  two  to  that  of  sample  one.  Using  the  criterion 
that  each  row  or  column  of  the  transformation  matrix  should  contain 
one  element  near  plus  or  minus  one  and  the  rest  near  zero,  the 
author concluded that four of the six components displayed stability
across  different  sets  of  teachers  and  different  sets  of  objectives. 

In  1967,  Taylor  and  Maguire  tried  to  determine  what 
differences  existed  among  perceptions  of  broad  objectives  of  science 
curriculum  by  three  groups  of  people.  A  semantic  differential 
consisting of eighteen concepts and twenty-seven scales was rated by
three  groups  of  people  labelled  as  teachers,  writers,  and  experts. 
Component  structures  were  obtained  for  each  group  and  the  fit  of 
pairs  of  component  structures  was  determined  using  Schonemann's 
procedure.  The  authors  concluded  that  no  major  differences  existed 
between  the  groups'  perceptions  of  science  objectives.  The  conclusion 
was  based  on  the  fact  that  values  for  the  solution  of  the  Procrustes 
problem  were  close  to  unity. 

In  many  of  the  studies  described  above,  degree  of  factorial 
similarity  has  been  assessed  using  the  magnitude  of  the  angles  through 
which  one  structure  would  have  to  be  rotated  in  order  to  more  closely 
match a second structure. However, the structures themselves are
generally rotations of a principal axes solution and therefore they are
in one sense arbitrary. The "goodness" of a rotation is most often
a configural judgment. That is, "goodness" is measured by result,
not procedure. Therefore, if two structures resulting from, say,
principal components-varimax analyses of data for two groups do
not  resemble  each  other  initially,  it  may  be  possible  to  rotate  one 
into  close  approximation  of  the  other.  In  that  case,  even  though  the 
transformation  matrix  indicated  substantial  rotation  was  necessary, 
the  researcher  should  conclude  that  the  factor  structures  are  similar. 
In summary then, a number of approaches have been
developed  for  assessing  the  degree  of  factorial  similarity.  Whenever 
indices  such  as  proposed  by  Burt,  Tucker,  Wrigley  and  Neuhaus,  and 
Kaiser  are  employed,  no  basis  is  given  for  determining  the  minimum 
value  for  the  coefficient  in  order  to  define  congruent  factors.  If  one 
resorts  to  Ahmavaara's  method  of  plotting  the  transformed  matrix 
against  the  target  matrix,  then  the  problem  of  what  constitutes  a 
significant plot is encountered. If the average sum of squares of the
difference matrix is computed, the researcher has no basis for judging
the magnitude of the resulting value. Researchers who have relied on the values
in  the  transformation  matrix  as  a  basis  for  assessing  structural 
stability  have  not  reached  agreement  as  to  what  value  determines 
whether  factors  are  stable  across  samples  or  variables. 

Although  a  number  of  solutions  have  been  offered  for  factor 
matching  and  testing  whether  matches  are  significant,  the  question  of 
what  constitutes  a  satisfactory  match  still  prevails.  Rather  than 
using  the  transformation  matrix  of  direction  cosines  as  a  guideline 
in  assessing  the  significance  of  the  matches,  the  present  study  shifts 
the  emphasis  to  the  difference  obtained  between  the  rotated  and  the 
target  matrices.  Schonemann  adopted  the  same  emphasis  and  in 
order  to  bring  about  a  more  efficient  match,  set  the  criterion  that 
the  difference  between  the  rotated  and  the  target  matrix  be  a  minimum. 
Thus given a source matrix $A_1$ and a target matrix $A_2$, an orthogonal
transformation matrix T is applied to $A_1$ to rotate it as close as
possible to $A_2$. In matrix form this can be expressed as

$$A_1 T = A_2 + E$$

where E is the matrix of differences between the rotated and the
target matrix $A_2$. By least-squares principles, the sum of squared
elements of E, tr(E'E), should be a
minimum.  Schonemann's  solution  requires  the  derivation  of  a 
unique  transformation  matrix  in  order  that  the  criterion  be  fulfilled. 

However,  even  if  one  resorts  to  using  Schonemann's 
procedure,  the  problem  of  whether  the  factors  in  the  two  samples 
are  similar  still  exists.  This  is  so  because  no  known  attempt  has 
been  made  at  assessing  the  sampling  distributions  of  goodness  of  fit 
statistics  as  put  forth  by  the  various  methods  for  assessing  factorial 
similarity. 

Therefore,  if  a  match  is  performed,  it  would  be  worthwhile 
to  know  whether  or  not  the  two  factor  solutions  are  significantly 
different. A researcher would then be able to conclude with
certain  confidence  that  the  two  factor  solutions  are  derived  from 
subjects  from  the  same  population. 

SUMMARY 

The  comparison  of  one  set  of  factor  loadings  to  another  set 
has  been  isolated  as  one  of  the  problems  facing  researchers.  Studies 
employing  various  techniques  for  matching  factors  have  been 
reviewed  and  the  inconsistency  in  deciding  whether  one  set  of  factors 
matches  another  was  demonstrated. 

PURPOSE  OF  THE  STUDY 

The  purpose  of  the  present  study  was  to  develop  an 
empirical  sampling  distribution  from  one  type  of  factor  match 
procedure and to extend the principles of inferential hypothesis testing
into the factor analysis area. Components were matched according to
the procedure developed by Schonemann. For each match, the average
tr(E'E)  and  the  "largest  absolute  value"  of  the  error  matrix  were 
calculated.  Based  on  the  null  hypothesis  that  two  samples  are  drawn 
at  random  from  a  population  exhibiting  a  particular  component  structure, 
an  attempt  was  made  to  develop  a  distribution  of  the  average  tr(E'E) 
of  the  error  matrix.  The  25,  50,  75,  90,  95,  and  99  percentile  points 
were  computed  for  the  various  distributions  in  an  attempt  to  provide 
some  basis  for  assessing  satisfactory  matching. 
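
The sampling procedure outlined above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the program used in the study. The function names, the six by two population matrix POP_A, the way observations are generated (component scores plus independent unique parts that bring each variable to unit variance), the use of the population matrix as the target, and the definition of average tr(E'E) as tr(E'E) divided by the number of loadings are all hypothetical choices made for the example. The rotation step uses the singular value decomposition, a standard route to the least-squares orthogonal rotation.

```python
import numpy as np

def procrustes_rotation(source, target):
    """Orthogonal T minimising tr(E'E) for E = source @ T - target."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt

def sample_loadings(z, r):
    """Principal component loadings (n x r) from an (N x n) data matrix."""
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:r]          # indices of the r largest
    return eigvecs[:, top] * np.sqrt(eigvals[top])

def average_trace_distribution(pop_a, n_obs=50, n_matches=1000, seed=0):
    """Empirical distribution of average tr(E'E) under the null hypothesis."""
    rng = np.random.default_rng(seed)
    n_vars, r = pop_a.shape
    # Assumption: unique standard deviations complete each variable to unit variance.
    unique_sd = np.sqrt(1.0 - np.sum(pop_a**2, axis=1))
    out = np.empty(n_matches)
    for m in range(n_matches):
        scores = rng.standard_normal((n_obs, r))
        unique = rng.standard_normal((n_obs, n_vars)) * unique_sd
        z = scores @ pop_a.T + unique
        a_hat = sample_loadings(z, r)
        t = procrustes_rotation(a_hat, pop_a)
        e = a_hat @ t - pop_a
        # Assumption: "average" means division by the number of loadings.
        out[m] = np.trace(e.T @ e) / (n_vars * r)
    return out

# Hypothetical population structure: six variables, two components.
POP_A = np.array([[0.8, 0.1], [0.7, 0.2], [0.6, 0.0],
                  [0.1, 0.8], [0.2, 0.7], [0.0, 0.6]])

dist = average_trace_distribution(POP_A, n_obs=50, n_matches=200)
points = np.percentile(dist, [25, 50, 75, 90, 95, 99])
print(points)
```

An observed average tr(E'E) from a real match could then be compared with the upper percentile points of such a distribution, which is the use the study intends for its tabled values.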


CHAPTER  II 


SOME  RELATED  LITERATURE 

The  present  chapter  covers  three  related  aspects  of  the 
literature: a brief history of factor analysis, the component analysis
model,  and  factorial  invariance.  The  latter  is  dealt  with  in  more 
detail  than  the  former  two. 

A. BRIEF HISTORY OF FACTOR ANALYSIS

Although  factor  analysis  has  been  used  extensively  in 
psychology,  it  is  not  a  psychological  theory  but  rather  a  branch  of 
statistics.  Harman  (1967)  has  credited  Spearman  with  the  primary 
development  of  factor  analysis. 

Horst  (1965)  summarized  Spearman's  contribution.  At  the 
turn  of  the  1900's  considerable  debate  arose  as  to  the  presence  of 
general  and  special  abilities  in  the  cognitive  domain.  Spearman 
hypothesized  that  intelligent  behavior  or  any  cognitive  variable  was 
composed  of  a  general  ability  factor  and  a  factor  unique  to  each 
variable.  Several  years  later,  Spearman  applied  correlational 
procedures  to  data  in  an  attempt  to  verify  his  hypothesis  of  the  two- 
factor  theory  of  intelligence. 

It  was  soon  realized  that  the  two-factor  theory  was  not 
adequate  to  describe  a  battery  of  psychological  measures.  For 
instance some tests were found to have a factor in common, in addition
to the general factor, that was not common to other tests. To
compensate  for  this  inadequacy,  Holzinger  (Harman,  1967)  modified 
the two-factor theory. Holzinger's bi-factor method obtained a
general  and  one  or  more  group  factors.  A  variable  or  test  was 
described  in  terms  of  a  general  factor,  a  group  factor,  and  a  unique 
factor. 

Thurstone  (1947)  extended  the  work  of  Spearman  and  like 
Holzinger, favored a theory of many common factors. As a consequence,
Thurstone's method of multiple factor analysis was popularized.
For Thurstone, factor analysis had two different tasks. A
factor  analytic  study  should  condense  the  test  scores  by  expressing 
them  in  terms  of  a  relatively  small  number  of  linearly  independent 
factors  or  it  could  discover  the  underlying  processes  which  produce 
the  test  performances  and  describe  the  individual  differences  in  terms 
of  the  underlying  processes.  At  the  same  time  if  the  factor  model 
was  to  express  the  underlying  processes,  Thurstone  stated  a  number 
of  requirements  to  be  fulfilled  by  the  model.  Firstly,  the  model 
should  be  parsimonious,  that  is,  it  should  use  as  small  a  number  of 
concepts  with  as  great  a  scope  as  possible.  Secondly,  the  factors 
should  be  invariant.  This  means  that  factors  should  not  be  dependent 
on  just  one  grouping  of  variables  and  individuals  that  are  being 
analyzed  at  that  moment,  but  should  appear  in  analyses  under 
different  experimental  circumstances.  Lastly,  the  model  should 
display  uniqueness  of  results.  One  configuration  of  factors  should 
occur  over  repeated  investigations  with  different  test  batteries 
within  the  same  domain. 

In  summary  then,  "The  principal  concern  of  factor  analysis 
is  the  resolution  of  a  set  of  variables  linearly  in  terms  of  (usually)  a 
small number of categories or 'factors'. This resolution can be
accomplished by the analysis of the correlations among the variables.
A satisfactory solution will yield factors which convey all the essential
information of the original set of variables." (Harman, 1967,
p. 4).


B.  THE  COMPONENT  ANALYSIS  MODEL 


The  aim  of  factor  analysis  is  to  express  a  score  on  a  test  or 
variable in terms of several underlying factors. The simplest mathematical
model for describing a variable in terms of several others is
linear.  However,  Harman  (1967)  indicated  that  a  researcher  has 
several  alternatives  available  within  the  linear  framework.  An 
analysis could be performed which would best reproduce the correlations
between the variables or one which would extract the maximum
variance  from  the  variables.  The  latter  alternative  formulates  the 
basis  for  this  study.  In  1933,  Hotelling  developed  the  method  of 
principal  component  analysis.  Algebraically  this  model  can  be 
represented  as: 


$$Z_{ji} = a_{j1}F_{1i} + a_{j2}F_{2i} + a_{j3}F_{3i} + \cdots + a_{jr}F_{ri}$$

where

$Z_{ji}$ - standard score for person i on test j
$a_{jr}$ - component loading of test j on component r
$F_{ri}$ - component score for person i for component r

The observed variables ($Z_{ji}$) are now described linearly in terms of
components ($F_{ri}$). Successive components make a maximum contribution
to the sum of the variances of the variables that are residual
after preceding components have been extracted.

Since  the  present  study  is  concerned  only  with  orthogonal 
solutions  and  the  extraction  of  "r"  components,  a  parsimonious 
expression  of  the  above  model  could  be  represented  in  matrix  notation 
as:

$$Z = AF$$

where

Z = standard score matrix of order (n x N)
A = a matrix of component loadings of order (n x r)
F = a matrix of component scores of order (r x N)

and

N = number of observations
n = number of variables
r = number of components

Computationally, analysis proceeds from the correlation matrix R,
(n x n) rather than the observed variables. Thus if Z = AF is postmultiplied
by its transpose and then divided by N, the number of
observations, the correlation matrix R is formed.

$$Z = AF$$

$$R = \frac{ZZ'}{N} = \frac{AFF'A'}{N}$$

If the component scores are scaled with zero mean and unit variance
and are orthogonal, then $FF'/N = I$, the identity matrix, and the
expression reduces to $R = AA'$. The matrix A is then determined
from a principal component analysis of R.
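
The relations R = ZZ'/N, FF'/N = I, and R = AA' can be checked numerically. The following Python sketch assumes NumPy and uses made-up data (the mixing matrix and sample size are hypothetical); it keeps all n components so that the reproduction of R is exact, whereas with r < n components R would only be approximated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 200 observations on 4 correlated variables.
mix = np.array([[1.0, 0.5, 0.3, 0.0],
                [0.0, 1.0, 0.4, 0.2],
                [0.0, 0.0, 1.0, 0.5],
                [0.0, 0.0, 0.0, 1.0]])
raw = rng.standard_normal((200, 4)) @ mix

# Standardise each variable, then arrange Z as variables x observations (n x N).
z = ((raw - raw.mean(axis=0)) / raw.std(axis=0)).T
n_vars, n_obs = z.shape

# Correlation matrix R = ZZ'/N.
r_matrix = z @ z.T / n_obs

# Principal components of R: loadings are eigenvectors scaled by the
# square roots of their eigenvalues, largest eigenvalue first.
eigvals, eigvecs = np.linalg.eigh(r_matrix)
order = np.argsort(eigvals)[::-1]
a = eigvecs[:, order] * np.sqrt(eigvals[order])

# With all n components kept, the component scores F solve Z = AF.
f = np.linalg.solve(a, z)

print(np.allclose(a @ a.T, r_matrix))                  # checks R = AA'
print(np.allclose(f @ f.T / n_obs, np.eye(n_vars)))    # checks FF'/N = I
```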
TRANSFORMATIONS

Since A is not uniquely determined by R, transformations are usually
carried out on A in an attempt to meet the criterion of simple
structure as defined by Thurstone (1947). Thus given a component
matrix A, it may be rotated by
postmultiplication by a transformation matrix T to form B. Hence

B  =  A  T, 

where  T'T  =  TT'  =  I. 

The  matrix  of  loadings  B  should  now  meet  the  condition  of  simple 
structure. The transformation matrix T may be determined graphically
or analytically. With the advent of computers, T has generally
been  determined  by  recourse  to  analytic  methods  such  as  varimax 
(Kaiser,  1958)  and  quartimax  (Neuhaus  and  Wrigley,  1954). 
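
A small numerical example may make the indeterminacy concrete. The Python sketch below (NumPy assumed; the loading matrix and the 30 degree angle are hypothetical) rotates a matrix A by an orthogonal T and checks that B = AT reproduces the same correlations and communalities as A, which is why a further criterion such as simple structure is needed to choose among rotations.

```python
import numpy as np

# Hypothetical loadings of four variables on two components.
a = np.array([[0.8, 0.3],
              [0.7, 0.4],
              [0.2, 0.9],
              [0.3, 0.7]])

# An orthogonal transformation: a plane rotation through 30 degrees.
theta = np.deg2rad(30.0)
t = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

b = a @ t  # rotated loadings, B = AT

print(np.allclose(t.T @ t, np.eye(2)))       # T'T = I
print(np.allclose(b @ b.T, a @ a.T))         # BB' = AA'
print(np.allclose(np.sum(b**2, axis=1),      # communalities are unchanged
                  np.sum(a**2, axis=1)))
```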

Further  problems  are  introduced  when  oblique  rather  than 
orthogonal  transformations  of  the  axes  are  made.  The  present  study 
concerns  only  orthogonal  solutions  and  the  problems  of  oblique 
transformation  need  not  concern  us  here. 

C.  FACTORIAL  INVARIANCE 

THE  CONCEPT  OF  INVARIANCE 

Thurstone  (1947)  specified  that  in  factor  analytic  work, 
invariance of results is one of the requirements. Young and Householder
(1940, p. 53) noted that "any consistent system of factor
analysis must yield the same values (apart from errors) for the
factor loadings of individuals or tests, regardless of the combination
in which they are presented." Others (Thurstone, 1947; Henrysson,
1957; and Quereshi, 1957) have assigned more than one meaning to
invariance  especially  in  those  areas  of  psychology  where  factor 
analysis  is  regarded  as  the  appropriate  method  of  investigation.  In 
such  a  context,  invariance  may  have  one  of  the  following  meanings: 

(a)  consistency  of  factors,  for  a  fixed  set  of  variables, 
from  one  sample  of  individuals  to  another, 

(b)  consistency  of  factors,  for  a  fixed  sample  of  individuals, 
from  one  set  of  variables  to  another, 

(c)  consistency  of  factors  when  both  samples  and  variables 
are  methodically  varied,  and 

(d)  consistency  of  factors,  for  a  fixed  sample  of  individuals 
and  a  fixed  set  of  variables,  from  one  method  of  analysis 
to  another. 

The  present  study  is  concerned  with  the  first  meaning  ascribed 
to  invariance. 

NATURE  OF  THE  PROBLEM 

The  problem  of  factorial  invariance  as  stated  by  Hunka 
(1967)  stems  from  the  application  of  the  factor  analytic  model  to 
observed  data.  Since  no  unique  mathematical  solution  exists  for  the 
model,  the  problem  of  determining  when  the  same  factors  are 
present  in  two  different  studies  is  a  major  concern. 

Problems  involving  estimation  of  factorial  invariance  may 
fall  into  one  of  four  categories  where  each  category  is  defined  by 
(a)  the  nature  of  the  sample  and  (b)  the  nature  of  the  variables.  The 
four  conditions  thus  encountered  and  writers  proposing  solutions 
for each are presented in Figure 1.

                                      VARIABLES

                     FIXED                          VARIABLE

SAMPLE   FIXED       A  Wrigley-Neuhaus (1955)      B  Wrigley-Neuhaus (1955)
                                                       Tucker (1958)

         VARIABLE    C  Burt (1948)                 D  No adequate solution
                        Tucker (1951)
                        Ahmavaara (1954) (1958)
                        Wrigley-Neuhaus (1955)
                        Kaiser (1960)
                        Schonemann (1966)


Figure  1 

Types  of  Invariance  and  Proposed  Solutions 

SOLUTIONS  TO  FACTORIAL  INVARIANCE 

Although four types of invariance solutions can be distinguished,
the only one of concern here is situation C - that of fixed tests and variable
samples. In addition only those solutions using orthogonal transformations
will be discussed.

If  the  same  test  is  administered  to  two  different  samples, 
then  configurational  invariance  may  take  place.  Here  the  change  in 
samples  affects  the  size  of  the  loadings,  but  in  proportion  to  the 
changes  in  the  variance  of  the  tests  administered  to  the  different 
samples. Hence if one sample has less variance in some of the variables,
this will result in a corresponding reduction in the size of the
loadings. Although the numerical values are changed, the
configuration of the loadings should remain the same and the factors
should  be  the  same  in  all  the  analyses. 

A  number  of  attempts  have  been  made  to  devise  means  for 
testing  the  type  of  invariance  discussed  above.  In  general,  the  problem 
could be approached along two lines. One could determine the relationship
between the factors as vectors in the test space when the test
variables are made to maximally overlap (Kaiser, 1960) or one could
determine the degree of similarity of factors in terms of factor loadings.
It is the latter approach that is employed in the present study.
Furthermore, working with factor loadings and requiring orthogonal
transformations opens another two avenues of approach. Starting with
two sets of factor loadings, each of these could be rotated to a compromise
position or one set could be defined as a target and the other set
would be rotated to maximum congruence with it. Once again, the
latter  alternative  is  focal. 

As  was  previously  mentioned,  various  researchers  have 
employed  indices  to  assess  the  degree  of  factorial  similarity.  Perhaps 
one  of  the  most  common  ways  to  assess  stability  of  factors  across 
samples is to calculate the "root mean square" suggested by Harman
(1967).  This  index  may  be  put  in  the  form  of 

$$\sqrt{\frac{1}{n}\sum_{j=1}^{n}\left({}_{1}a_{jp} - {}_{2}a_{jq}\right)^{2}}$$

where ${}_{1}a_{jp}$ represents the factor loading of test j on factor p in study
one. A similar interpretation can be given for ${}_{2}a_{jq}$. This is a simple
type of index and really does not indicate what comprises an adequate
value for agreement. Furthermore factor one in study one could be
factor two in study two and the result would be an inappropriate pairing
of factors.
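
The root mean square index, and the pairing problem just described, can be illustrated with a short Python sketch (NumPy assumed). The function name, the zero-based column indices p and q, and the two loading matrices are hypothetical; the second study is constructed as the first with its two factors interchanged.

```python
import numpy as np

def root_mean_square(a1, a2, p, q):
    """Root mean square difference between column p of a1 and column q of a2.

    a1 and a2 hold loadings of the same n tests (rows) in two studies;
    p and q are zero-based column indices.
    """
    diff = a1[:, p] - a2[:, q]
    return float(np.sqrt(np.mean(diff**2)))

# Hypothetical loadings for three tests on two factors in study one.
a1 = np.array([[0.8, 0.1],
               [0.7, 0.2],
               [0.1, 0.9]])
# Study two: the same factors, extracted in the opposite order.
a2 = a1[:, ::-1]

same_position = root_mean_square(a1, a2, 0, 0)   # factor one with factor one
correct_pair = root_mean_square(a1, a2, 0, 1)    # factor one with factor two
print(same_position, correct_pair)
```

Comparing factors by position alone gives a large value here even though the two studies contain identical factors, so the index depends on the pairing having been settled beforehand.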

Several investigators, Burt (1948), Tucker (1951), and Wrigley
and Neuhaus (1955), have employed an index roughly resembling a
correlation coefficient to compare loadings of tests on factors.
Adopting Tucker's terminology, such an index was called a coefficient of
congruence and can be defined as

$$\frac{A_1' A_2}{\sqrt{(A_1' A_1)(A_2' A_2)}}$$

The A's represent matrices of factor loadings for studies one and two.
The coefficient has a range from -1 to +1. Harman (1967) pointed out
that this index is not a correlation since the loadings are not deviation
scores and the summation is carried over the variables rather than
the number of observations. In addition, tests of significance for the
index are not available, and neither Tucker nor Burt nor Wrigley and
Neuhaus commit themselves to stating a minimal value for the
coefficient.
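As a rough illustration, the coefficient of congruence for a single pair of factors can be computed as below. The sketch assumes each A is one column of loadings over the same tests; the function name `congruence` and the numbers are hypothetical.

```python
import math


def congruence(column_1, column_2):
    """Coefficient of congruence between two columns of factor loadings.

    The sums run over tests (variables), and the loadings are not
    converted to deviation scores, so this is not a correlation.
    """
    cross = sum(a * b for a, b in zip(column_1, column_2))
    norm = math.sqrt(sum(a * a for a in column_1) * sum(b * b for b in column_2))
    return cross / norm


# Hypothetical loadings of four tests on one factor in each study.
print(congruence([0.80, 0.70, 0.10, 0.20], [0.75, 0.72, 0.15, 0.10]))
```

Proportional columns give +1 and columns of opposite sign give -1, which is the range stated above.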

To test the quality of fit, Kaiser employed a mean cosine
index which can be represented as

$$\frac{1}{n}\,\mathrm{trace}\left(H_1^{-1} A_1 A_2' H_2^{-1}\right)$$

where $H_1$ and $H_2$ represent the diagonal matrices composed of the
square roots of the communalities for the common tests of the two
studies and $A_1$ and $A_2$ represent the matrices of factor loadings.

Kaiser also does not specify a lower bound for the index.
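The trace expression above amounts to averaging, over the n common tests, the cosine between each test's row of loadings in the two studies. The following sketch makes that reading explicit; it assumes both matrices are given as lists of rows in the same test order, and the name `mean_cosine` and the data are hypothetical.

```python
import math


def mean_cosine(A1, A2):
    """Mean cosine between corresponding rows of two loading matrices.

    Each row norm is the square root of that test's communality,
    i.e. the matching diagonal element of H1 or H2.
    """
    total = 0.0
    for row_1, row_2 in zip(A1, A2):
        h1 = math.sqrt(sum(a * a for a in row_1))
        h2 = math.sqrt(sum(b * b for b in row_2))
        dot = sum(a * b for a, b in zip(row_1, row_2))
        total += dot / (h1 * h2)
    return total / len(A1)


# Hypothetical 3-test by 2-factor loading matrices.
A1 = [[0.8, 0.1], [0.2, 0.7], [0.5, 0.5]]
A2 = [[0.7, 0.2], [0.1, 0.8], [0.6, 0.4]]
print(mean_cosine(A1, A2))
```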

Thus the aspect of defining indices which purport to measure
the degree of factorial similarity or stability appears to be
meaningless if no guidelines are given to indicate what comprises a
significant match. Using Schonemann's procedure, it is the hope of the
present study to establish such guidelines.

SCHONEMANN'S  ORTHOGONAL  PROCRUSTES  SOLUTION 


Schonemann (1966) showed how a source matrix $A_1$ of factor
loadings could be transformed into a least squares approximation of a
target matrix $A_2$ by an orthogonal matrix T. Green (1952) and Cliff
(1966) have also provided solutions. Simply put, the problem is to
find the matrix T given $A_1$ and $A_2$. The situation can be represented as

$$A_1 T = A_2 + E \tag{1}$$

where E is the error matrix. The above equation can be rewritten as

$$E = A_1 T - A_2 \tag{2}$$

The Schonemann procedure yields a least-squares solution, that is,

$$\mathrm{tr}(E'E) = \text{minimum} \tag{3}$$

under the constraint

$$T'T = TT' = I. \tag{4}$$

Schonemann went on to show that equation (3) can be expressed as

$$g_1 = \mathrm{tr}(E'E) = \mathrm{tr}\left(T'A_1'A_1T - 2T'A_1'A_2 + A_2'A_2\right) \tag{5}$$

The side condition T'T = I is then expressed as

$$g_2 = \mathrm{tr}\left(L(T'T - I)\right) \tag{6}$$

where L is a matrix of (unknown) Lagrange multipliers. Equations (5)
and (6) are then combined as

$$g = g_1 + g_2 \tag{7}$$

and g is partially differentiated with respect to the elements of the
matrix T. The result is

$$\frac{\partial g}{\partial T} = (A_1'A_1 + A_1'A_1)T - 2A_1'A_2 + T(L + L') \tag{8}$$

If $A_1'A_1$ is represented by P, $A_1'A_2$ by S, and $(L + L')/2$ by Q, then
setting the derivative in (8) to zero and dividing by two shows that one
must solve

$$S = PT + TQ \tag{9}$$

Both P and Q are symmetric so that

$$Q = T'S - T'PT = Q' \tag{10}$$

Since T'PT is symmetric if P is, then T'S must be symmetric or

$$T'S = S'T \tag{11}$$

From (4) and (11), S = TS'T and

$$SS' = TS'ST' \tag{12}$$

From $S = A_1'A_2$, two other matrices are formed such that

$$S'S = VDV' \tag{13}$$

and

$$SS' = WDW', \tag{14}$$

where D is the diagonal matrix of latent roots of both S'S and SS', V
is the matrix of unit-length latent vectors of S'S, and W is the matrix
of unit-length latent vectors of SS'.

Since SS' = TS'ST', SS' can be substituted with WDW' and
S'S with VDV', yielding the equation

$$WDW' = TVDV'T' \tag{15}$$

Solving for W then yields

$$W = TV \tag{16}$$

or

$$T = WV' \tag{17}$$
Once T has been calculated, it is applied to matrix $A_1$. The result
$A_1T$ represents the best approximation of the target matrix $A_2$. The
difference between $A_1T$ and $A_2$ is computed and represents the error
matrix, E. Since the transformation matrix T is unique it satisfies
the criterion that tr(E'E) is a minimum.
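The solution T = WV' can be sketched numerically. The block below is an illustration only and not the program used in the study: it assumes NumPy is available and obtains W and V from a singular value decomposition of S, which supplies latent vectors of SS' and S'S with mutually consistent signs. The function name and the example matrices are hypothetical.

```python
import numpy as np


def orthogonal_procrustes(A1, A2):
    """Rotate source A1 toward target A2 with an orthogonal T.

    Returns T = W V' and the error matrix E = A1 T - A2.
    """
    S = A1.T @ A2                  # S = A1'A2
    W, _, Vt = np.linalg.svd(S)    # S = W diag(d) V'
    T = W @ Vt                     # T = W V'
    E = A1 @ T - A2
    return T, E


# Hypothetical check: the target is the source rotated by 30 degrees,
# so the recovered T should reproduce that rotation and E should vanish.
theta = np.deg2rad(30.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
A1 = np.array([[0.8, 0.1], [0.7, 0.2], [0.1, 0.9], [0.2, 0.6]])
A2 = A1 @ rotation

T, E = orthogonal_procrustes(A1, A2)
print(np.round(T, 4))
print(np.trace(E.T @ E))
```

Using the singular value decomposition rather than two separate latent-root problems avoids having to align the signs of the columns of W and V by hand.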

Obtaining tr(E'E), together with the fact that it is a minimum, gives a
worthwhile criterion for comparing one set of factor loadings to
another. Instead of assessing the values within the transformation
matrix, an overall assessment of the approximation of the source
matrix to the target matrix can be obtained. This overall assessment
is given by the single value of tr(E'E).

However, the single value of tr(E'E) offers very little knowledge
as to the "quality of fit" of the source to the target matrix. A
sampling distribution for the average tr(E'E) would be helpful in
assessing the "quality of fit" of the two data sets. It is the intent of
the present study to determine a sampling distribution for the average
tr(E'E) values when the Orthogonal Procrustes solution is used for
assessing factorial similarity.

SUMMARY 

In this chapter several concepts of component analysis and
factor matching were reviewed. Various indices to measure the degree
of factorial similarity were presented. Finally, the procedure devised
by Schonemann was described. His technique incorporates an orthogonal
transformation matrix and matches a set of factor loadings
called the source to another set of loadings called the target.

An outline of the procedure will be presented in Chapter III.


CHAPTER  III 


PROCEDURE 

Twenty sets of one thousand random variables were generated.
Each set was approximately normally distributed with a mean of fifty
and a standard deviation of five. A correlation matrix (R) of order
twenty by twenty was calculated and factored, from which twenty
uncorrelated sets of one-thousand component scores each were produced.
Each set had a mean of zero and a standard deviation of one. This
matrix of component scores (F) was of order one-thousand by twenty
and served as the source for the samples used in the study. Various
component structure matrices (A) of order n variables by r components
were selected from the literature for use as the population matrices.
By forming the product Z = AF' we would have in Z the scores of 1,000
people on n variables such that if a component analysis were performed
on the correlation matrix derived from the scores of Z, an approximation
of the structure matrix A would be returned. For the remainder of
this section an A matrix of order five variables by three components
will be used as an example to add clarity to the procedure.

Two samples of fifty component scores ($\hat{F}$) each were drawn
with replacement from the first three columns of the population matrix
(F). These two samples of order fifty observations by three components
were designated as $\hat{F}_1$ and $\hat{F}_2$. For each sample of fifty
component scores, a product matrix was produced by forming
$\hat{Z} = A\hat{F}'$. Thus for the two samples drawn each $\hat{F}$ was
premultiplied by the A matrix of order five variables by three
components resulting in

$$\hat{Z}_1 = A\hat{F}_1'$$

and

$$\hat{Z}_2 = A\hat{F}_2'$$

where both $\hat{Z}_1$ and $\hat{Z}_2$ were of the order fifty observations by
five variables. The procedure thus far is based on the idea that drawing
two samples of size fifty from Z = AF' (that is, the population) is the
same as drawing two samples of F and forming Z = AF'.

For each $\hat{Z}$ thus formed, correlations were computed among
the five variables to produce a correlation matrix ($\hat{R}$) of order five
variables by five variables. A component structure matrix ($\hat{A}$) was
calculated for each $\hat{Z}$ matrix. Thus for $\hat{Z}_1$ a corresponding
$\hat{A}_1$ was calculated and likewise an $\hat{A}_2$ for $\hat{Z}_2$. Each
$\hat{A}$ was of the order five variables by three components, and the two
$\hat{A}$'s were produced by two samples coming from the same population.
Employing Schonemann's procedure the two component structure matrices
$\hat{A}_1$ and $\hat{A}_2$ were matched. Additional factor score samples of
order fifty by three were drawn pairwise and the A matrix was used to
produce additional $\hat{Z}$ matrices which were factored and compared
until one thousand matches had been performed.

For each match the largest absolute value of the error matrix
and average tr(E'E) were computed. Average tr(E'E) is defined by

$$\frac{1}{nr}\sum_{i=1}^{n}\sum_{j=1}^{r} e_{ij}^{2}$$

where

$e_{ij}$ = elements of the error matrix
n = number of variables
r = number of components.
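The two summary values computed for each match can be written out directly. This is a minimal sketch; the function names and the small error matrix are hypothetical.

```python
def average_trace(E):
    """Average tr(E'E): the sum of squared errors divided by n * r."""
    n = len(E)       # number of variables (rows)
    r = len(E[0])    # number of components (columns)
    return sum(e * e for row in E for e in row) / (n * r)


def largest_absolute_value(E):
    """Largest absolute element of the error matrix."""
    return max(abs(e) for row in E for e in row)


# Hypothetical 2-variable by 2-component error matrix.
E = [[0.1, -0.2],
     [0.0, 0.3]]
print(average_trace(E))
print(largest_absolute_value(E))
```

Because tr(E'E) is the sum of the squared elements of E, dividing by nr gives the mean squared error per loading, which makes values comparable across matrices of different order.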

Hence for every A matrix used, one-thousand "largest absolute value"
and one-thousand average tr(E'E) values of the error matrix were
obtained. These one-thousand values were an empirical sampling
distribution given the true null hypothesis that $\hat{A}_1$ and $\hat{A}_2$ are
both estimates of a common A matrix. Table 1 shows the various A
matrices which were used in the study. References for the A matrices
may be found in Appendix A.
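The matching procedure described above can be condensed into a short simulation. The sketch below is an approximation under stated assumptions and not the original program: it uses NumPy, draws the population F directly from a standard normal generator instead of factoring twenty generated variables, arranges each sample as observations by variables, and uses a made-up 5 by 3 matrix A in place of any matrix from Table 1. All function names are hypothetical.

```python
import numpy as np


def component_structure(Z, r):
    """First r principal-component loadings of the correlations among columns of Z."""
    R = np.corrcoef(Z, rowvar=False)
    roots, vectors = np.linalg.eigh(R)
    keep = np.argsort(roots)[::-1][:r]          # r largest latent roots
    return vectors[:, keep] * np.sqrt(np.clip(roots[keep], 0.0, None))


def procrustes_error(A1, A2):
    """Error matrix E = A1 T - A2 with T = W V' from the SVD of A1'A2."""
    W, _, Vt = np.linalg.svd(A1.T @ A2)
    return A1 @ (W @ Vt) - A2


def simulate(A, F, sample_size=50, matches=1000, seed=0):
    """Empirical distributions of average tr(E'E) and largest |e|."""
    rng = np.random.default_rng(seed)
    n, r = A.shape
    avg_trace = np.empty(matches)
    largest = np.empty(matches)
    for m in range(matches):
        estimates = []
        for _ in range(2):
            # Draw rows with replacement from the first r columns of F.
            rows = rng.integers(0, F.shape[0], size=sample_size)
            Z_hat = F[rows, :r] @ A.T           # sample_size by n scores
            estimates.append(component_structure(Z_hat, r))
        E = procrustes_error(estimates[0], estimates[1])
        avg_trace[m] = np.sum(E ** 2) / (n * r)
        largest[m] = np.abs(E).max()
    return avg_trace, largest


# Hypothetical population: 1000 by 20 scores and a made-up 5 by 3 A matrix.
F = np.random.default_rng(123).standard_normal((1000, 20))
A = np.array([[0.8, 0.3, 0.1],
              [0.7, 0.4, -0.2],
              [0.2, 0.8, 0.3],
              [0.1, 0.7, -0.4],
              [0.5, -0.2, 0.7]])

avg_trace, largest = simulate(A, F, sample_size=50, matches=200)
print(np.percentile(avg_trace, [25, 50, 75, 90, 95, 99]))
```

Because the A matrix and the random scores here are invented, the printed percentiles are not expected to reproduce the values reported in the tables.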

For  several  of  the  A  matrices,  the  procedure  was  repeated 
using  sample  sizes  of  100  and  150.  In  such  cases  only  500  matches 
were  performed  rather  than  the  1000.  Matrices  subjected  to  the 
increased  sample  sizes  are  presented  in  Table  2. 

Frequency distributions of the "largest absolute value" and
the average tr(E'E) values for the different A matrices were obtained.
For each distribution, the 25, 50, 75, 90, 95, and 99 percentile points
were calculated. In addition the maximum value of the "largest absolute
value" and the average tr(E'E) values were obtained. Cumulative
frequency polygons for the average tr(E'E) for the 5 x 3, 10 x 4,
16 x 3, and 20 x 6 matrices were plotted. Frequency polygons were
also plotted for several of the matrices subjected to the increased
sample sizes. For such cases, plots appear for Harman's 5 x 3,
Ohnmacht's 16 x 3, and Evanechko's 20 x 6 factor structure matrices.
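Percentile points of an empirical distribution can be read off the sorted values. The sketch below assumes a nearest-rank definition, which may differ from the grouped-frequency calculation used in the study; the function name and the data are hypothetical.

```python
def percentile_points(values, points=(25, 50, 75, 90, 95, 99)):
    """Nearest-rank percentile points of a list of values.

    For each integer percentile p, returns the smallest observed value
    with at least p percent of the observations at or below it.
    """
    ordered = sorted(values)
    n = len(ordered)
    result = {}
    for p in points:
        rank = max(1, (p * n + 99) // 100)   # integer ceiling of p * n / 100
        result[p] = ordered[rank - 1]
    return result


# Hypothetical "distribution": the integers 1 to 1000 in place of
# one thousand average tr(E'E) values.
values = list(range(1, 1001))
print(percentile_points(values))
print(max(values))
```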
TABLE 1

ORDER AND NUMBER OF A MATRICES USED BY THE STUDY

                      Components
Variables     1    2    3    4    5    6

    5                   4
    6                   2
    7
    8              2
    9
   10                        2
   11                                  1
   12                        2         1
   13                   1    1
   14
   15
   16                   2
   17
   18                   1         1
   19
   20                                  2
TABLE 2

MATRICES SUBJECTED TO SAMPLE SIZES
OF 100 AND 150

Order     Reference

5x3       Harman
          Morrison
8x2       Harman
          Harman
10x4      Mosychuk
          Noble
16x3      Hallworth
          Ohnmacht
20x6      Hallworth
          Evanechko

SUMMARY 

An outline of the procedure incorporated by the study was
presented in the present chapter. The study adopts for its basis a
randomly generated factor score matrix of order one thousand
observations by twenty variables and twenty-two different A matrices
ranging in order from five by three to twenty by six.

Results of the study will be presented in the following chapter.


CHAPTER  IV 


RESULTS 


Table 3 shows the percentile points and maximum value for
average tr(E'E) of the various A matrices used in the study. The same
percentile points, calculated for the "largest absolute value" in
the error matrix, are presented in Table 4. Similar information is
presented in Tables 5 and 6 for the A matrices whose sample sizes
were increased to 100 and 150. Percentile values for a sample size
of 50 are also included in these tables for comparative purposes.

Figures 2, 3, 4, and 5 are the cumulative frequency polygons
for average tr(E'E) for the 5 by 3, 10 by 4, 16 by 3, and 20 by 6
A matrices respectively. Frequency polygons for the same A matrices
appear in Figures 6, 7, 8, and 9. Frequency polygons for data
subjected to sample sizes of 50, 100, and 150 appear in Figures 10,
11, and 12 respectively.

DISCUSSION  OF  THE  RESULTS 

If the Orthogonal Procrustes solution is used as the factor
matching technique, then Tables 3, 4, 5, and 6 should provide some
guidelines as to the "quality of fit" of two data sets. Of practical
importance are the values obtained for the 95 and 99 percentiles and
maximum values of average tr(E'E) and the "largest absolute value"
of the error matrix. At the 95 percentile, values for average tr(E'E)
range from 0.0074 to 0.0145. At the 99 percentile the values range
from 0.0112 to 0.0232 while the maximum average tr(E'E) values
TABLE 3

SELECTED PERCENTILE POINTS AND MAXIMUM VALUE OF THE
CUMULATIVE FREQUENCY FOR AVERAGE tr(E'E)

(Leading decimal points omitted)

A Matrix Source     Order   Percentile                          Maximum
                            25    50    75    90    95    99    Value

Harman              5x3     0015  0029  0054  0089  0119  0189  0330
Hunka               5x3     0023  0041  0069  0104  0129  0195  0300
Morrison            5x3     0011  0022  0038  0059  0074  0117  0225
Morrison            5x3     0027  0046  0075  0104  0132  0202  0345
Morrison            6x3     0021  0038  0064  0095  0116  0147  0180
Morrison            6x3     0011  0023  0039  0060  0075  0122  0165
Harman              8x2     0009  0023  0049  0086  0111  0178  0420
Harman              8x2     0008  0018  0037  0063  0087  0131  0210
Mosychuk            10x4    0041  0061  0084  0112  0133  0175  0225
Noble               10x4    0031  0045  0065  0087  0102  0138  0195
Kraus & Walker      11x6    0049  0064  0082  0102  0113  0133  0210
Hogg                12x4    0019  0030  0045  0061  0074  0112  0195
Eyre                12x4    0026  0042  0067  0093  0113  0148  0240
Lovell & Gorton     12x6    0035  0046  0060  0074  0086  0112  0135
Glass & Maguire     13x3    0030  0050  0070  0117  0145  0232  0315
Morrison            13x4    0022  0034  0050  0068  0082  0118  0240
Hallworth           16x3    0030  0051  0081  0121  0145  0210  0300
Ohnmacht            16x3    0016  0028  0045  0069  0085  0118  0225
Taylor & Maguire    18x3    0014  0026  0044  0067  0091  0141  0240
Walberg             18x5    0046  0063  0086  0113  0131  0161  0240
Hallworth           20x6    0045  0060  0076  0095  0110  0152  0210
Evanechko           20x6    0058  0073  0093  0114  0127  0149  0210
TABLE 4

SELECTED PERCENTILE POINTS AND MAXIMUM VALUE OF THE
CUMULATIVE FREQUENCY FOR "LARGEST ABSOLUTE
VALUE" IN THE ERROR MATRIX

(Leading decimal points omitted)

A Matrix Source     Order   Percentile                          Maximum
                            25    50    75    90    95    99    Value

Harman              5x3     0842  1176  1580  1983  2364  2812  3600
Hunka               5x3     1040  1426  1837  2246  2538  3193  4200
Morrison            5x3     0691  0976  1276  1597  1824  2456  3000
Morrison            5x3     1135  1528  1950  2375  2760  3300  4200
Morrison            6x3     1175  1621  2124  2598  2912  3471  4050
Morrison            6x3     0891  1278  1737  2187  2543  3337  4500
Harman              8x2     0736  1215  1816  2534  2953  3825  5250
Harman              8x2     0696  1145  1660  2154  2561  3100  4050
Mosychuk            10x4    1625  2030  2460  2954  3276  3870  4350
Noble               10x4    1658  2088  2620  3176  3529  4350  5100
Kraus & Walker      11x6    1919  2222  2593  2977  3246  3600  4650
Hogg                12x4    1446  1849  2283  2830  3142  3900  4800
Eyre                12x4    1517  1925  2379  2875  3270  3925  4950
Lovell & Gorton     12x6    1812  2222  2683  3207  3573  4387  5700
Wiggens & Lovell    13x3    1426  1848  2335  2875  3246  3950  5250
Morrison            13x4    1403  1774  2248  2680  3028  3690  4650
Hallworth           16x3    1371  1765  2209  2724  2982  3525  4800
Ohnmacht            16x3    1508  2120  2854  3620  4217  5137  6750
Taylor & Maguire    18x3    1258  1668  2234  2902  3339  4400  5400
Walberg             18x5    2088  2511  3019  3515  3867  4500  5850
Hallworth           20x6    2069  2486  2909  3435  3700  4512  4950
Evanechko           20x6    2161  2532  2997  3219  3702  4320  5550
TABLE 5

SELECTED PERCENTILE POINTS AND MAXIMUM VALUE
FOR AVERAGE tr(E'E) FOR VARIOUS SAMPLE SIZES

(Leading decimal points omitted)

                    Order and Reference
                    5 x 3                 8 x 2                 10 x 4                16 x 3                20 x 6
Percentile  Sample  Harman     Morrison   Harman (E) Harman (P) Mosychuk   Noble      Hallworth  Ohnmacht   Hallworth  Evanechko
            Size

25          50      0015       0027       0008       0009       0041       0031       0030       0016       0045       0058
            100     0005       0014       0005       0006       0019       0013       0015       0007       0021       0027
            150     0004       0008       0005       0005       0012       0073       0084       0005       0015       0019

50          50      0029       0046       0018       0023       0061       0045       0051       0028       0060       0073
            100     0012       0024       0010       0012       0028       0022       0025       0014       0028       0037
            150     0009       0017       0009       0010       0020       0015       0017       0011       0021       0024

75          50      0054       0075       0037       0049       0084       0065       0081       0046       0076       0093
            100     0019       0037       0017       0026       0041       0030       0040       0025       0039       0046
            150     0013       0027       0014       0014       0028       0024       0028       0017       0027       0030

90          50      0089       0104       0063       0086       0112       0087       0121       0069       0095       0114
            100     0030       0055       0033       0045       0055       0043       0059       0038       0049       0057
            150     0019       0040       0024       0027       0037       0029       0041       0026       0031       0040

95          50      0119       0132       0087       0111       0133       0102       0145       0085       0110       0127
            100     0039       0068       0042       0063       0064       0052       0072       0048       0056       0062
            150     0026       0045       0031       0036       0042       0036       0050       0029       0039       0043

99          50      0189       0202       0131       0178       0175       0133       0210       0118       0152       0149
            100     0060       0097       0063       0095       0086       0067       0107       0067       0072       0074
            150     0039       0066       0054       0063       0053       0045       0072       0041       0045       0049

Maximum     50      0330       0345       0210       0420       0225       0210       0300       0225       0210       0210
Value       100     0075       0135       0110       0165       0105       0075       0120       0090       0105       0090
            150     0075       0105       0090       0090       0075       0090       0090       0045       0075       0060
TABLE 6

SELECTED PERCENTILE POINTS AND MAXIMUM VALUE OF
"LARGEST ABSOLUTE VALUE" IN THE ERROR MATRIX
FOR VARIOUS SAMPLE SIZES

(Leading decimal points omitted)

                    Order and Reference
                    5 x 3                 8 x 2                 10 x 4                16 x 3                20 x 6
Percentile  Sample  Harman     Morrison   Harman (E) Harman (P) Mosychuk   Noble      Hallworth  Ohnmacht   Hallworth  Evanechko
            Size

25          50      0842       1135       0696       0736       1625       1658       1371       1508       2068       2161
            100     0483       0809       0460       0515       1118       1095       0929       1056       1470       1536
            150     0380       0673       0439       0405       0910       0897       0772       0897       1214       1159

50          50      1176       1528       1145       1215       2030       2088       1765       2120       2486       2532
            100     0668       1050       0754       0856       1386       1369       1236       1500       1723       1782
            150     0544       0885       0652       0661       1163       1165       1022       2182       1422       1448

75          50      1580       1950       1660       1816       2460       2620       2209       2854       2909       2997
            100     0893       1374       1113       1286       1711       1732       1571       1991       2060       2097
            150     0723       1133       0941       0963       1384       1459       1267       1800       1656       1662

90          50      1983       2375       2154       2534       2954       3176       2724       3620       3435       3419
            100     1150       1721       1517       1800       2057       2118       1934       2686       2377       2421
            150     0879       1442       1279       1344       1601       1795       1546       2166       1905       1884

95          50      2365       2765       2461       2953       3276       3529       2982       4217       3700       3702
            100     1312       1940       1815       2088       2378       2400       2170       3125       2675       2655
            150     0986       1586       1562       1775       1762       1967       1746       2342       2111       2060

99          50      2812       3300       3100       3825       3870       4350       3525       5137       4512       4320
            100     1620       2300       2280       2650       2925       3037       2490       3825       3125       3100
            150     1275       1850       1970       2190       2062       2550       2130       2850       2475       2400

Maximum     50      3600       4200       4050       5250       4350       5100       4800       6750       4950       5550
Value       100     1950       2700       2700       4050       3300       3600       2850       4500       3900       3450
            150     1650       2700       3150       2550       2550       3000       2400       3450       3150       2850

FIGURE 2  CUMULATIVE POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 5 BY 3 A MATRICES
(vertical axis: cumulative percentage)

FIGURE 3  CUMULATIVE FREQUENCY POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 10 BY 4 A MATRICES
(vertical axis: cumulative percentage; curves: A - Noble, B - Mosychuk)

FIGURE 4  CUMULATIVE FREQUENCY POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 16 BY 3 A MATRICES
(vertical axis: cumulative percentage)

FIGURE 5  CUMULATIVE FREQUENCY POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 20 BY 6 A MATRICES
(vertical axis: cumulative percentage)

FIGURE 6  FREQUENCY POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 5 BY 3 A MATRICES
(vertical axis: frequency)

FIGURE 7  FREQUENCY POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 10 BY 4 A MATRICES
(vertical axis: frequency)

FIGURE 8  FREQUENCY POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 16 BY 3 A MATRICES
(vertical axis: frequency)

FIGURE 9  FREQUENCY POLYGONS FOR AVERAGE TRACE E'E VALUES USING VARIOUS 20 BY 6 A MATRICES
(vertical axis: frequency)

FIGURE 10  FREQUENCY POLYGONS FOR AVERAGE TR E'E VALUES OF HARMAN 5 BY 3 MATRIX USING SAMPLE SIZES OF 50, 100, AND 150
(vertical axis: frequency)

FIGURE 11  FREQUENCY POLYGONS FOR AVERAGE TR E'E VALUES OF OHNMACHT 16 BY 3 MATRIX USING SAMPLE SIZES OF 50, 100, AND 150
(vertical axis: frequency)

FIGURE 12  FREQUENCY POLYGONS FOR AVERAGE TR E'E VALUES OF EVANECHKO 20 BY 6 MATRIX USING SAMPLE SIZES OF 50, 100, AND 150
(vertical axis: frequency)

over the twenty-two A matrices range from 0.0135 to 0.0420. For
the "largest absolute value" in the error matrix, the 95 percentile
values range from 0.1824 to 0.4217 while the 99 percentile values range
from 0.2456 to 0.5137. The maximum "largest absolute value" for
the error matrix ranges from 0.3000 to 0.6750.

Inspection of the average tr(E'E) and "largest absolute
value" values revealed no trend (that is, no stability of average
tr(E'E) or "largest absolute value") either over the twenty-two A
matrices or over matrices of the same order. The lack of trend was
further confirmed by inspection of the frequency polygons depicted
in Figures 6, 7, 8, and 9.

Although  graphs  were  not  presented  for  all  of  the  matrices, 
a  rough  plot  of  all  twenty-two  matrices  revealed  that  all  the  curves 
would  be  positively  skewed.  Since  the  criterion  for  the  Orthogonal 
Procrustes  solution  required  that  average  tr(E'E)  be  a  minimum  then 
positively  skewed  distributions  would  be  expected. 

A plausible explanation for variations in the distributions
would be the use of different A matrices. Since the A matrices were
chosen without any specific criteria, aside from being unrotated or
varimax rotated, variations would be dependent upon the correlation
matrix which produced the corresponding A matrix. Hence one
would have the intuitive feeling that Harman's correlation matrix of
five socio-economic variables would produce a different factor
structure matrix than Hunka's correlation matrix of five textbook
variables. Each correlation matrix is characterized by two
features: composition of the sample and composition of the
variables. Composition of the sample could be further partitioned into
two facets: number and type of people. An increase in the sample
size, and whether the people were homogeneous or heterogeneous with
respect to some entity, would affect the correlation matrix and
consequently the component structure. Similar changes in the composition
of the variables would also affect the correlation matrix. Therefore,
if the parameters of the correlation matrix are changed,
characteristics of the component structure matrix will also be changed.

Tables 5 and 6 presented the changes in the magnitude of
average tr(E'E) and the "largest absolute value" for ten of the
twenty-two A matrices when sample size was increased from 50 to 100 and
150. At the 95 percentile and sample sizes of 50, 100, and 150,
average tr(E'E) values ranged from 0.0085 to 0.0145, 0.0048 to
0.0072, and 0.0026 to 0.0050 respectively. Values at the 99 percentile
and identical sample sizes ranged from 0.0131 to 0.0210, 0.0060
to 0.0107, and 0.0039 to 0.0072, and maximum average tr(E'E) values
ranged from 0.0210 to 0.0345, 0.0075 to 0.0165, and 0.0045 to
0.0105. The ranges for the "largest absolute value" at the 95
percentile and sample sizes of 50, 100, and 150 were 0.2365 to
0.4217, 0.1312 to 0.3125, and 0.0986 to 0.2342 for the respective
sample sizes. Values ranged from 0.2812 to 0.5137, 0.1620 to
0.3825, and 0.1275 to 0.2850 at the 99 percentile and from 0.3600
to 0.6750, 0.1950 to 0.4500, and 0.1650 to 0.3450 for the maximum
value. Hence values for average tr(E'E) and "largest absolute
value" decreased in magnitude with increases in sample size. This
result supports the earlier statement that increases in the
sample size would affect the correlation matrix. An increase in the
sample size yields a better estimate of whatever population
parameter is being measured. Factoring these correlation matrices
produces less error and therefore a better fit
between pairs of factor structures. Figures 10, 11, and 12 show the
trend of a decrease in the magnitude of average tr(E'E) with an
increase in the sample size.


SUMMARY 

The results of the study were reported in the present
chapter. Tables were presented to indicate various percentile points
and maximum values for average tr(E'E) and "largest absolute value"
for twenty-two different A matrices. Percentile points were also
reported for ten of the A matrices which were subjected to increased
sample sizes. In the Discussion of the Results, an attempt was made
to summarize points of practical importance. In addition to the
numerical results, several graphs were included to give an indication
of the distribution and shape of the average tr(E'E) values for
several of the A matrices.

Summary, conclusions, and recommendations of the study
will be presented in the final chapter.


CHAPTER V


SUMMARY,  CONCLUSIONS,  AND  RECOMMENDATIONS 

SUMMARY 

In the present study an attempt was made at determining an
empirical sampling distribution from one type of factor match
technique, the Orthogonal Procrustes solution. One thousand matches
were performed for each of twenty-two different A matrices.
Frequency distributions of average tr(E'E) and "largest absolute
value" were computed for each set of one thousand matches. The 25,
50, 75, 90, 95, and 99 percentile points were calculated for each
distribution. In addition to these values the maximum value for each
distribution was calculated. The same procedure was repeated for
several A matrices which were subjected to sample size increases.
Percentile points and maximum value for average tr(E'E) and
"largest absolute value" were reported. Several graphs of average
tr(E'E) for some of the A matrices were also plotted.

CONCLUSIONS 

There is a certain amount of constraint imposed on
generalizations from the present study. Because a particular set of
methodological procedures was implemented and a certain set of
parameters held constant, generalization of results from this study
to empirical studies employing other methods should be undertaken
with caution. Different methods of factoring and factor matching, as
well as different numbers of variables and components, could produce
different results from those reported here.
If factor structure matrices are a result of a principal
component analysis and are either unrotated or rotated to the varimax
criterion and later subjected to an Orthogonal Procrustes solution for
factor matching, results of this study could be used as guidelines in
evaluating the quality of fit. Although no probabilities can be
associated with the various percentile levels reported, it would appear
that an average tr(E'E) value greater than 0.0232 (99 percentile)
would indicate that the pair of factor structures are different. In
addition, average tr(E'E) values at the 95 percentile (0.0145), 90
percentile (0.0121), and the 75 percentile (0.0093) may be used for
assessing the significance of factorial similarity. The above
mentioned average tr(E'E) values for the various percentiles result
from a sample size of fifty. Table 5 reveals that sample size
increases would decrease the magnitude of tr(E'E)/nr.

Some researchers might want to use the "largest absolute
value" as an index of similarity between two factorial structures. A
"largest absolute value" greater than 0.5137 (99 percentile) would
show that the two structures are different. "Largest absolute values"
at the 95, 90, and 75 percentile levels may be used as guidelines in
determining whether the structures are similar. Researchers using
the "largest absolute value" as a measure of factorial similarity
should be aware of the effect of increased sample sizes upon the
magnitude of the "largest absolute value."

Even though it is possible to have factorial similarity and
an average tr(E'E) greater than 0.023 or a "largest absolute value"
greater than 0.5137, heuristic evidence from this study would
suggest that the structures are different. If the results of the present
study are used in the interpretation of, say, the Taylor and Maguire
data and we adopt the 90 percentile as our level of confidence, then
the null hypothesis of "no major differences exist between the groups'
perceptions of science objectives" is rejected. The conclusion is
based on the fact that the observed average tr(E'E) value of 0.015 is
greater than the critical value of 0.012 at the 90 percentile.
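The comparison just described can be expressed as a small lookup. The sketch below uses the average tr(E'E) guideline values quoted above for a sample size of fifty (0.0093, 0.0121, 0.0145, and 0.0232); the function name is hypothetical, and the rule itself is a heuristic, since no probabilities attach to these percentile levels.

```python
# Guideline values for average tr(E'E) at a sample size of fifty,
# as quoted in the Conclusions (percentile -> largest tabled value).
CRITICAL_AVERAGE_TRACE_N50 = {75: 0.0093, 90: 0.0121, 95: 0.0145, 99: 0.0232}


def highest_percentile_exceeded(observed, critical=CRITICAL_AVERAGE_TRACE_N50):
    """Return the highest percentile whose guideline value the observed
    average tr(E'E) exceeds, or None if it exceeds none of them."""
    exceeded = [p for p, value in critical.items() if observed > value]
    return max(exceeded) if exceeded else None


# Observed value from the Taylor and Maguire illustration in the text.
print(highest_percentile_exceeded(0.015))
```

For the observed value of 0.015 the function returns 95, so the value exceeds not only the 90 percentile guideline adopted in the text but also the 95 percentile guideline.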

RECOMMENDATIONS 

Whereas the results of the present study have been encouraging,
several modifications could be implemented for an extension
of research in this area. Sample sizes larger than 150 could be used
in the analysis.

Since the results of the study are specific to the A matrices
used, A matrices could be produced randomly. Kaiser and Dickman
(1962) and Cliff and Pennell (1967) have outlined methods for generating
sample correlation matrices based on a population correlation
matrix. Results thus obtained could be compared with the findings
of the present study. Since the effect of sample size cannot be
ignored in the present study, the next appropriate step would be to
determine an F-distributed variate in which the effect of sample size
is removed. Furthermore, determining the statistic under which
average tr(E'E) is distributed would also be worthwhile.